### Amendments to the Drawings

See Attached Drawings



FIG. 1








FIG. 3





FIG. 4





## Forwarding from RFC

[Pipeline timing table not recoverable from the source. Legible stage labels: Cycle, Address, Data, Sec'y Data (G), Execution (E), Store (S), Store Buffers. Legible instruction: st [MA1], R2.]

Note: The load/store instructions specify the same virtual load/store address.

Clock 4, E-stage: Result data destined for memory also written into RFC.

Clock 5, G-stage: Result data specified by load instruction forwarded from RFC.



## FIG. 6

## Forwarding from Store Buffer

| Cycle | Address (A)    | Data (D)       | Sec'y Data (G) | Execution (E)  | Store (S)      | Write-back (WB) | Store Buffers (SB) |
|-------|----------------|----------------|----------------|----------------|----------------|-----------------|--------------------|
| 1     | st [MA1], R2   |                |                |                |                |                 |                    |
| 2     | st [MA1], R3   | st [MA1], R2   |                |                |                |                 |                    |
| 3     | add R4, R5, R4 | st [MA1], R3   | st [MA1], R2   |                |                |                 |                    |
| 4     | add R7, R8, R7 | add R4, R5, R4 | st [MA1], R3   | st [MA1], R2   |                |                 |                    |
| 5     | sub R4, R5, R4 | add R7, R8, R7 | add R4, R5, R4 | st [MA1], R3   | st [MA1], R2   |                 |                    |
| 6     | sub R7, R8, R7 | sub R4, R5, R4 | add R7, R8, R7 | add R4, R5, R4 | st [MA1], R3   | st [MA1], R2    |                    |
| 7     | ld R9, [MA1]   | sub R7, R8, R7 | sub R4, R5, R4 | add R7, R8, R7 | add R4, R5, R4 | st [MA1], R3    | st [MA1], R2       |
| 8     |                | ld R9, [MA1]   | sub R7, R8, R7 | sub R4, R5, R4 | add R7, R8, R7 | add R4, R5, R4  | st [MA1], R3       |
| 9     |                |                | ld R9, [MA1]   | sub R7, R8, R7 | sub R4, R5, R4 | add R7, R8, R7  |                    |
| 10    |                |                |                | ld R9, [MA1]   | sub R7, R8, R7 | sub R4, R5, R4  |                    |
| 11    |                |                |                |                | ld R9, [MA1]   | sub R7, R8, R7  |                    |
| 12    |                |                |                |                |                | ld R9, [MA1]    |                    |

Note: The load/store instructions specify the same load/store address.

Clock 7, E-stage: Result data of first store instruction from R2 destined for memory written into first store buffer.

Clock 8, E-stage: Result data of second store instruction from R3 destined for memory written into second store buffer.

Clock 9, G-stage: Result data specified by load instruction forwarded from second store buffer.
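
The selection implied by the notes above (two buffered stores to the same address, with the load receiving data from the second store buffer) can be sketched as follows. This is a minimal illustration, not the circuit shown in the figure: the `StoreBufferEntry` type, the `forward_from_store_buffers` function, and the data values 0x22 and 0x33 standing in for the contents of R2 and R3 are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoreBufferEntry:
    address: str  # symbolic address such as "MA1" (hypothetical encoding)
    data: int     # store data waiting to be written to memory


def forward_from_store_buffers(buffers: List[StoreBufferEntry],
                               load_address: str) -> Optional[int]:
    """Return data from the youngest buffered store to load_address, else None."""
    # Entries are assumed to be ordered oldest first, so scan from the end.
    for entry in reversed(buffers):
        if entry.address == load_address:
            return entry.data
    return None


# Assumed values: 0x22 stands in for R2, 0x33 for R3.
buffers = [StoreBufferEntry("MA1", 0x22), StoreBufferEntry("MA1", 0x33)]
forwarded = forward_from_store_buffers(buffers, "MA1")
```

Under these assumptions the load of `[MA1]` receives the data of the second store, matching the Clock 9 note; a load to an address with no buffered store gets `None` and would have to be satisfied elsewhere.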



## Speculative Forwarding with Correction Due to Virtual Aliasing Condition

| Cycle | Address (A)  | Data (D)     | Sec'y Data (G) | Execution (E) | Store (S)    | Write-back (WB) | Store Buffers (SB) |
|-------|--------------|--------------|----------------|---------------|--------------|-----------------|--------------------|
| 1     | st [MA1], R2 |              |                |               |              |                 |                    |
| 2     | ld R4, [MA2] | st [MA1], R2 |                |               |              |                 |                    |
| 3     |              | ld R4, [MA2] | st [MA1], R2   |               |              |                 |                    |
| 4     |              |              | ld R4, [MA2]   | st [MA1], R2  |              |                 |                    |
| 5     |              |              |                | ld R4, [MA2]  | st [MA1], R2 |                 |                    |
| 6     |              |              |                | ld R4, [MA2]  |              | st [MA1], R2    |                    |
| 7     |              |              |                | ld R4, [MA2]  |              |                 | st [MA1], R2       |
| …     |              |              |                |               |              |                 |                    |
| n-1   |              |              |                | ld R4, [MA2]  |              |                 |                    |
| n     |              | ld R4, [MA2] |                |               |              |                 |                    |
| n+1   |              |              | ld R4, [MA2]   |               |              |                 |                    |
| n+2   |              |              |                | ld R4, [MA2]  |              |                 |                    |
| n+3   |              |              |                |               | ld R4, [MA2] |                 |                    |
| n+4   |              |              |                |               |              | ld R4, [MA2]    |                    |

Note: The load/store instructions specify different virtual load/store addresses that translate to the same physical address.

Clock 5, E-stage: Data from data unit speculatively forwarded to E-stage load instruction.

Clocks 5 through n-1, E-stage: Load instruction stalled in E-stage while store instruction data written to data cache.

Clock 7, Store Buffers: Result data of store instruction destined for memory written into store buffer.

Clock n, D-stage: Load instruction reissued within data unit.

Clock n+1, G-stage: Load instruction generates hit in data cache.
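
The aliasing condition in the note (different virtual addresses, same physical address) can be expressed as a simple check. The sketch below is illustrative only: the `needs_reissue` function and the `page_table` mapping, including its physical addresses, are assumptions made for the example and are not taken from the figure.

```python
from typing import Callable


def needs_reissue(store_va: str, load_va: str,
                  translate: Callable[[str], int]) -> bool:
    """True when store and load differ virtually but alias physically."""
    virtual_match = store_va == load_va
    physical_match = translate(store_va) == translate(load_va)
    # A virtual match is the ordinary forwarding case; the correction
    # applies only when the match shows up after translation.
    return physical_match and not virtual_match


# Hypothetical translation: MA1 and MA2 map to the same physical address.
page_table = {"MA1": 0x1000, "MA2": 0x1000, "MA3": 0x2000}
translate = page_table.__getitem__

alias_detected = needs_reissue("MA1", "MA2", translate)
```

With this assumed mapping, the `st [MA1], R2` / `ld R4, [MA2]` pair is flagged, which corresponds to the stalled load being reissued at clock n once the store data has reached the data cache.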



## FIG. 8

## Speculative Forwarding with Correction Due to Access of Non-Cacheable Region

| Cycle | Address (A)  | Data (D)     | Sec'y Data (G) | Execution (E) | Store (S)    | Write-back (WB) | Store Buffers (SB) |
|-------|--------------|--------------|----------------|---------------|--------------|-----------------|--------------------|
| 1     | st [MA1], R2 |              |                |               |              |                 |                    |
| 2     | nop          | st [MA1], R2 |                |               |              |                 |                    |
| 3     | ld R3, [MA1] | nop          | st [MA1], R2   |               |              |                 |                    |
| 4     |              | ld R3, [MA1] | nop            | st [MA1], R2  |              |                 |                    |
| 5     |              |              | ld R3, [MA1]   | nop           | st [MA1], R2 |                 |                    |
| 6     |              |              |                | ld R3, [MA1]  | nop          | st [MA1], R2    |                    |
| 7     |              |              |                | ld R3, [MA1]  |              | nop             | st [MA1], R2       |
| …     |              |              |                |               |              |                 |                    |
| n-1   |              |              |                | ld R3, [MA1]  |              |                 |                    |
| n     |              |              |                | ld R3, [MA1]  |              |                 |                    |
| n+1   |              |              |                |               | ld R3, [MA1] |                 |                    |
| n+2   |              |              |                |               |              | ld R3, [MA1]    |                    |

Note: The load/store instructions specify the same virtual load/store address, which translates to a physical address in a non-cacheable region.

Clock 6, E-stage: Store-hit data from RFC speculatively forwarded to E-stage load instruction.

Clocks 6 through n-1, E-stage: Load instruction stalled in E-stage while data specified by load instruction is fetched from I/O device or system memory into a response buffer.

Clock n, E-stage: Data requested by load instruction selected from response buffer.
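
The correction described in these notes amounts to choosing between the speculatively forwarded data and the response buffer. A minimal sketch, assuming a hypothetical `select_load_data` function, a callable that models the fetch from the I/O device or system memory, and made-up data values:

```python
from typing import Callable


def select_load_data(speculative_data: int, cacheable: bool,
                     fetch_into_response_buffer: Callable[[], int]) -> int:
    """Pick the data a load finally receives in the E-stage."""
    if cacheable:
        # Cacheable region: the speculatively forwarded data stands.
        return speculative_data
    # Non-cacheable region: discard the forwarded data and wait for the
    # fetch to fill the response buffer (the stall is not modeled here).
    response_buffer = fetch_into_response_buffer()
    return response_buffer


# Assumed values: 0x22 was forwarded speculatively, the device returns 0x5A.
device_value = 0x5A
result = select_load_data(0x22, False, lambda: device_value)
```

In this sketch the load to the non-cacheable address ends up with the device value rather than the forwarded store data, mirroring the Clock n note.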